Hurricane Ike Maximum Water Levels¶
Compute maximum water levels during Hurricane Ike on a 9-million-node triangular-mesh ADCIRC storm surge model, then visualize the results using HoloViz TriMesh rendering with Datashader.
import fsspec
import numpy as np
import pandas as pd
import xarray as xr
Start a dask cluster to crunch the data¶
from dask.distributed import Client
from dask_gateway import Gateway
gateway = Gateway()
cluster = gateway.new_cluster()
# from dask_kubernetes import KubeCluster
# cluster = KubeCluster()
cluster
cluster.adapt(minimum=4, maximum=20);
client = Client(cluster)
Read the data using the cloud-friendly zarr data format¶
ds = xr.open_zarr(
fsspec.get_mapper('s3://pangeo-data-uswest2/esip/adcirc/ike', anon=False, requester_pays=True)
)
# ds = xr.open_zarr(fsspec.get_mapper('gcs://pangeo-data/rsignell/adcirc_test01'))
ds['zeta']
<xarray.DataArray 'zeta' (time: 720, node: 9228245)>
dask.array<zarr, shape=(720, 9228245), dtype=float64, chunksize=(10, 141973), chunktype=numpy.ndarray>
Coordinates:
* time (time) datetime64[ns] 2008-09-05T12:00:00 ... 2008-09-10T11:50:00
x (node) float64 dask.array<chunksize=(141973,), meta=np.ndarray>
y (node) float64 dask.array<chunksize=(141973,), meta=np.ndarray>
Dimensions without coordinates: node
Attributes:
location: node
long_name: water surface elevation above geoid
mesh: adcirc_mesh
standard_name: sea_surface_height_above_geoid
    units:          m
How many GB of sea surface height data do we have?
ds['zeta'].nbytes / 1.0e9
53.1546912
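That figure checks out against the array's shape and dtype: 720 time steps × 9,228,245 nodes × 8 bytes per float64 value.

```python
# Verify the reported nbytes from the array shape and dtype
n_time, n_node = 720, 9228245
bytes_per_float64 = 8
size_gb = n_time * n_node * bytes_per_float64 / 1.0e9
print(size_gb)  # 53.1546912
```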
Take the maximum over the time dimension and persist the data on the workers to use later. This is the computationally intensive step.
max_var = ds['zeta'].max(dim='time').persist()
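The reduction collapses the time dimension, leaving one value per mesh node. A minimal NumPy stand-in (a toy array in place of the real `zeta`) shows the shape of the result:

```python
import numpy as np

# Toy stand-in for zeta: 4 time steps x 6 nodes
rng = np.random.default_rng(0)
zeta = rng.normal(size=(4, 6))

# Analogous to ds['zeta'].max(dim='time'): one max per node
max_var = zeta.max(axis=0)
print(max_var.shape)  # (6,)
```

With the real dask-backed array, `.persist()` additionally keeps the (node-sized) result in distributed worker memory so the plotting steps below don't recompute it.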
Visualize data on mesh using HoloViz.org tools¶
import cartopy.crs as ccrs
import geoviews as gv
import holoviews as hv
import holoviews.operation.datashader as dshade  # provides the datashade/rasterize operations used below
import hvplot.xarray
dshade.datashade.precompute = True
hv.extension('bokeh')
v = np.vstack((ds['x'], ds['y'], max_var)).T
verts = pd.DataFrame(v, columns=['x', 'y', 'vmax'])
points = gv.operation.project_points(gv.Points(verts, vdims=['vmax']))
# ADCIRC element connectivity is 1-based; shift to 0-based indices for TriMesh
tris = pd.DataFrame(ds['element'].values.astype('int') - 1, columns=['v0', 'v1', 'v2'])
tiles = gv.tile_sources.OSM
value = 'max water level'
label = f'{value} (m)'
trimesh = gv.TriMesh((tris, points), label=label)
mesh = dshade.rasterize(trimesh).opts(cmap='rainbow', colorbar=True, width=600, height=400)
tiles * mesh
Extract a time series at a specified lon, lat location¶
Because Xarray does not yet understand that x and y are coordinate variables on this triangular mesh, we write our own simple function to find the closest node. If we had many points to look up, a spatial tree algorithm would be faster.
# find the indices of the points in (x,y) closest to the points in (xi,yi)
def nearxy(x, y, xi, yi):
ind = np.ones(len(xi), dtype=int)
for i in range(len(xi)):
dist = np.sqrt((x - xi[i]) ** 2 + (y - yi[i]) ** 2)
ind[i] = dist.argmin()
return ind
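For many query points, a spatial tree beats the brute-force loop above. A sketch using SciPy's `cKDTree` (assuming scipy is available; like `nearxy`, this treats lon/lat as planar coordinates, which is a reasonable approximation at this scale):

```python
import numpy as np
from scipy.spatial import cKDTree

def nearxy_tree(x, y, xi, yi):
    """Indices of the (x, y) points closest to each (xi, yi) query point."""
    tree = cKDTree(np.column_stack([x, y]))
    _, ind = tree.query(np.column_stack([xi, yi]))
    return ind

# Tiny example: four nodes, one query point
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.0, 1.0, 0.0, 1.0])
print(nearxy_tree(x, y, [2.1], [0.1]))  # [2]
```

The tree is built once in O(n log n) and each lookup is then O(log n), versus O(n) per lookup for the loop.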
# just offshore of Galveston
lat = 29.2329856
lon = -95.1535041
ind = nearxy(ds['x'].values, ds['y'].values, [lon], [lat])
ds['zeta'][:, ind].hvplot(x='time', grid=True)